Question 1 [9 Marks]
Answer each of the following multiple choice questions by clearly writing the question
number and part in your answer book followed by the letter corresponding to your answer.
(For example: Q1 (i) A.)


(i) Which one of the following probabilities is a cumulative probability?

A. The probability that there are exactly 4 people with Type O+ blood in a sample
of 10 people.
B. The probability of exactly 3 heads in 6 flips of a coin.
C. The probability that the accumulated annual rainfall in a certain city next year,
rounded to the nearest millimetre, will be 450 mm.
D. The probability that a randomly selected woman’s height is 1.7 metres or less.
[1 Mark]


(ii) A comparison is to be made between the proportion of second graders that cannot
read at second grade level and the proportion of third graders that cannot read at
second grade level. School records from schools across the state are collected and
records for 123 second graders and 146 third graders are randomly selected. Of the
sampled second graders, 25 appear not to read at second grade level. Of the
sampled third graders, 26 do not read at second grade level.

What is the correct notation for the difference 25/123 - 26/146?


A. $\mu_1 - \mu_2$
B. $\bar{x}_1 - \bar{x}_2$
C. $p_1 - p_2$
D. $\hat{p}_1 - \hat{p}_2$ [1 Mark]


(iii) The average time taken to complete an exam follows a Normal probability distribution 
with mean = 60 minutes and standard deviation 30 minutes. 
What is the probability that a randomly chosen student will take more than 30 
minutes to complete the exam? 
A. 0.997 
B. 0.84 
C. 0.5 
D. 0.16 [1 Mark]
(iv) A statistics unit has 4 tutors: three female assistants (Lauren, Rona, and Leila) and
one male assistant (Josh). Each tutor teaches one tutorial group. A student selects
a tutorial group.

The two events W = {the tutor is a woman} and J = {the tutor is Josh} are

A. simple events.
B. complementary events.
C. mutually exclusive events.
D. independent events. [1 Mark]
(v) If the confidence level for a confidence interval is increased, which of the following
must also increase?

A. sample estimate
B. standard error
C. multiplier
D. none of the elements would be increased. [1 Mark]




(vi) A sample of 10 plants was taken and the mean height was 80.15 cm, with a sd of
7.1 cm. A 95% confidence interval for the true mean height of plants of that particular
species is (75.1 cm, 85.2 cm). Four students gave the following interpretations of the
confidence interval. Which of the following is correct?

A. We are 95% confident that the interval (75.1, 85.2) will include the true mean
height.
B. We are 95% confident that the true mean height is 80.15 cm since that value lies
in the confidence interval.
C. We can be fairly confident that 95% of all plants of that species have a height
between 75.1 cm and 85.2 cm.
D. None of the above. [1 Mark]


(vii) Which of the following studies describes a paired data design?

A. The testosterone levels of male doctors and male college professors are compared.
B. 40 students measure their blood pressure twice, first while resting and then again
after running in place for 10 minutes.
C. The mean blood pressure of men is compared to the mean blood pressure of
women.
D. All of the above involve paired data. [1 Mark]




(viii) Which one of the following choices describes a problem for which an analysis of
variance would be appropriate?

A. Analyzing the relationship between gender and opinion about capital punishment
(favor or oppose).
B. Comparing the mean birth weights of newborn babies for three different racial
groups.
C. Analyzing the relationship between high school results and university results.
D. Comparing the proportion of successes for three different treatments of anxiety.
Each treatment is tried on 100 patients. [1 Mark]


(ix) Which of the following best describes the standardized (z) score for an observation? 


A. It is the center of the list of scores from which the observation was taken. 
B. It is one standard deviation more than the observation. 
C. It is the most common score for that type of observation. 


D. It is the number of standard deviations the observation falls from the mean.
[1 Mark]
Question 2 [7 Marks]


Resting heart rate was measured for subjects prior to drinking 200 ml of coffee. Ten
minutes later their heart rates were measured again. The change in heart rate, X, follows
a normal distribution, with a mean increase of 7.3 beats per minute (bpm) and a standard
deviation of 11.1 bpm.

(a) Find the probability that X, the increase in heart rate, is greater than 13 bpm. You
may use $z = \frac{x - \mu}{\sigma}$ and refer to Figure 1 on p. 7. [3 Marks]


(b) Assume that a random sample of 16 was taken from the population, and the sample
mean calculated.

(i) State the sampling distribution of the sample mean $\bar{X}$. [2 Marks]

(ii) Will the probability that the sample mean is greater than 13 bpm be larger or
smaller than the probability found in (a)? Explain your answer, noting that
you do not have to calculate this probability. [2 Marks]
Hint: A sketch of the relevant distributions may help. 
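
As a numerical cross-check of parts (a) and (b)(ii), the two tail probabilities can be computed directly rather than read from Figure 1. The sketch below is illustrative only: it uses Python's standard-library `statistics.NormalDist`, takes the mean, standard deviation, sample size and threshold from the question, and assumes the sample mean of 16 independent observations is normal with standard deviation 11.1/√16. The variable names are hypothetical.

```python
from statistics import NormalDist

# Values taken from the question (bpm); names are illustrative.
MU, SIGMA, N, THRESHOLD = 7.3, 11.1, 16, 13.0

# Distribution of a single subject's change in heart rate.
single = NormalDist(mu=MU, sigma=SIGMA)
# Sampling distribution of the mean of N subjects:
# same mean, standard deviation sigma / sqrt(N).
sample_mean = NormalDist(mu=MU, sigma=SIGMA / N ** 0.5)

# Upper-tail probabilities beyond the threshold.
p_single = 1 - single.cdf(THRESHOLD)
p_mean = 1 - sample_mean.cdf(THRESHOLD)

print(f"P(X > 13)    = {p_single:.4f}")
print(f"P(Xbar > 13) = {p_mean:.4f}")
```

Because the sampling distribution of the mean is narrower, the same threshold lies more standard deviations above the centre, so the second probability is the smaller of the two.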
Figure 1: Standard Normal Curve

Question 3 [4 Marks]


Use sketches (either smooth probability distributions or histograms) to illustrate
distributions with the following shapes:

(a) Symmetrical. [1 Mark]
(b) Positively skewed. [1 Mark]
(c) Negatively skewed. [1 Mark]
(d) Bimodal. [1 Mark]
Question 4 [7 Marks]
Table 1 gives the counts of fatal car accidents in the U.S. during the 1995-1999 period.
These counts are categorized according to travel speed level (Low, Moderate, Fast, Very
Fast) and the car make (Ford, Other). The counts express the number of fatal accidents
that cannot be attributed to tyre punctures and involve compact Sports Utility Vehicles.


Speed Level (mph)    Ford   Other
Low (0-40)            171     465
Moderate (41-55)      243     753
Fast (56-65)           98     325
Very Fast (> 65)      108     273

Table 1: Fatal car accidents in the U.S. between 1995 and 1999.


(a) What type of variable is “speed level” in this problem? [1 Mark]

(b) Name a graphical summary that could help you to visualize the distribution of the
speed levels of Ford cars which were involved in a fatal accident, based on the counts
above. [1 Mark]

(c) We want to investigate whether the speed level is related to the car make (Ford or
Other). We continue the analysis using a Chi-squared ($\chi^2$) test.

(i) Explain why the $\chi^2$-test is appropriate in this setting. [1 Mark]
(ii) Formulate the null and the alternative hypotheses. [2 Marks]
(iii) Given the following output, state your conclusion. [2 Marks]


Pearson’s Chi-squared test 


data: as.table(acc) 


X-squared = 4.1191, df = 3, p-value = 0.2489 
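
The test statistic in the output can be reproduced by hand from the counts in Table 1. The following is a minimal sketch in plain Python (no statistics library), assuming the usual Pearson statistic without continuity correction and expected counts of row total × column total / grand total; the variable names are illustrative and not part of the original R session.

```python
# Observed counts from Table 1: rows are speed levels, columns are (Ford, Other).
observed = [
    [171, 465],  # Low
    [243, 753],  # Moderate
    [98, 325],   # Fast
    [108, 273],  # Very Fast
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over all cells, where E is the expected count
# under independence: row total x column total / grand total.
chi_squared = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_squared += (count - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) x (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"X-squared = {chi_squared:.4f}, df = {df}")
```

The statistic and degrees of freedom agree with the values reported in the output above; the p-value would additionally require the chi-squared distribution with 3 degrees of freedom.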
Question 5 [8 Marks]
Table 2 shows the number of hours of revision and the percentage scored in an end-of-term
maths exam by a sample of 8 students.

Exam Score (%)       49    35    57    58    50    34    67    87
Hours of revision     1   0.5     5   2.5     7     3     8    11

Table 2: Exam scores and revision hours for 8 students.


The teacher fits a linear regression model to predict the final exam score based on the 
hours of revision and the following output is obtained: 
Residuals:
     Min       1Q   Median       3Q      Max
-13.8875  -5.7688   0.6375   8.4375  12.0375

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   36.337      6.380   5.695  0.00127
hours          3.850      1.087   3.541  0.01220

Residual standard error: 10.6 on 6 degrees of freedom
Multiple R-squared: 0.6764, Adjusted R-squared: 0.6224
F-statistic: 12.54 on 1 and 6 DF, p-value: 0.0122


(a) State the equation of the model above. [1 Mark]
(b) Interpret the intercept and slope coefficients. [2 Marks]

(c) Predict the score of a student who has revised for 10 hours. Do you trust the
prediction? Explain your response. [2 Marks]

(d) Construct a 95% confidence interval for the slope and give an interpretation. You
can use the formula:
slope ± t* × (standard error of slope), where t* = 2.45.
[3 Marks]
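
The arithmetic behind parts (c) and (d) can be sketched as follows. This is an illustrative calculation only, using the estimates and standard error as printed in the regression output and the multiplier t* = 2.45 given in the question; the variable names are hypothetical, and the interpretation of the results is left to the reader.

```python
# Values read from the regression output and the formula in part (d).
intercept = 36.337   # estimated intercept
slope = 3.850        # estimated coefficient for hours
se_slope = 1.087     # standard error of the slope
t_star = 2.45        # multiplier given in the question

# 95% confidence interval: slope +/- t* x (standard error of slope).
margin = t_star * se_slope
lower, upper = slope - margin, slope + margin
print(f"95% CI for slope: ({lower:.3f}, {upper:.3f})")

# Point prediction for part (c) from the fitted line at 10 hours of revision.
predicted_10h = intercept + slope * 10
print(f"Predicted score after 10 hours: {predicted_10h:.1f}")
```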
Question 6 [7 Marks]
Samples that contained 4 mg of an antihistamine (Chlorphenamine) were sent to 7 different
laboratories, 1 per lab, in order to assess the consistency between the laboratories and the
variability of measurements. All samples were identical and each lab was asked to make 10
determinations. The Analysis of variance (ANOVA) methodology seems to be appropriate
for our problem.


(a) Figure 2 shows the boxplots of each group of determinations per lab. Using the
figure, comment on the consistency and variability of the measurements. [2 Marks]

[Figure: boxplots of the measurements of the 7 labs; vertical axis: Amount of
Chlorpheniramine maleate (mg); horizontal axis: Lab 1 to Lab 7]

Figure 2: Boxplots of Chlorphenamine measurements.


(b) State the null and the alternative hypothesis in words of the ANOVA test using the
setting of our problem. You can assume a significance level $\alpha = 0.05$. [2 Marks]




(c) Using the ANOVA table below, conclude if there is evidence to suggest that the
mean Chlorphenamine determination varies across the labs. [3 Marks]

            Df  Sum Sq   Mean Sq  F value  Pr(>F)
lab          6  0.1247  0.020790     5.66  0.0001
Residuals   63  0.2314  0.003673
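
The entries of the ANOVA table are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and the F value is the ratio of the two mean squares. The short sketch below checks this using the rounded figures printed in the table, so small rounding differences are expected; the variable names are illustrative.

```python
# Values read from the ANOVA table in part (c).
df_lab, ss_lab = 6, 0.1247    # between-lab degrees of freedom and sum of squares
df_res, ss_res = 63, 0.2314   # residual degrees of freedom and sum of squares
p_value, alpha = 0.0001, 0.05 # reported p-value and the significance level from (b)

ms_lab = ss_lab / df_lab      # mean square between labs
ms_res = ss_res / df_res      # residual mean square
f_value = ms_lab / ms_res     # F statistic

print(f"Mean Sq (lab) = {ms_lab:.6f}, Mean Sq (Residuals) = {ms_res:.6f}")
print(f"F value = {f_value:.2f}")
print("p-value below significance level:", p_value < alpha)
```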


Question 7 [8 Marks]
Are any physiological indicators associated with schizophrenia? In a 1990 article,
researchers reported the results of a study that controlled for genetic and socioeconomic
differences by examining 15 pairs of monozygotic twins, where one of the twins was
schizophrenic and the other was not. The researchers used magnetic resonance imaging to
measure the volumes (in cm³) of several regions and subregions of the twins’ brains. We
are interested in testing whether the volumes differ significantly between the two groups. The
numerical summaries of the two groups and their paired differences (Unaffected-Affected)
are given below:


                 mean         sd    IQR     0%    75%  100%   n
Affected    1.5600000  0.3012593  0.470   1.02  1.780  2.02  15
Unaffected  1.7586667  0.2424242  0.335   1.25  1.935  2.08  15
Diff        0.1986667  0.2382935  0.260  -0.19  0.315  0.68  15


(a) Using the numerical summaries above, comment on any differences between the brain
volumes of the two groups. [1 Mark]
(b) Explain why a paired t-test is appropriate for this problem. [1 Mark]

(c) Let $\bar{d}$ and $\mu_d$ represent the sample and the population mean of the differences respectively.
State the null and the alternative hypothesis of our test using statistical notation.
[1 Mark]




(d) Calculate the t statistic t using the formula:

$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}},$$

where $s_d$ is the sample standard deviation of the differences and n is the number of
observations.
[3 Marks]


(e) Using Figure 3 and the t statistic from part (d), write your conclusion about the
brain volume differences between the two groups. [2 Marks]

[Figure: CDF of Student's t distribution with 14 d.f.; vertical axis: Cumulative Probability]

Figure 3: CDF of Student’s t distribution with 14 degrees of freedom.
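
The formula in part (d) can be evaluated with the summary statistics from the "Diff" row of the table. The sketch below is illustrative only; it assumes the hypothesised population mean difference under the null hypothesis is 0 (an assumption, since the paper asks the candidate to state the hypotheses), and the variable names are hypothetical.

```python
from math import sqrt

# Summary statistics for the paired differences (Unaffected - Affected).
d_bar = 0.1986667   # sample mean of the differences
s_d = 0.2382935     # sample standard deviation of the differences
n = 15              # number of twin pairs
mu_d_null = 0.0     # ASSUMPTION: mean difference under the null hypothesis

# Standard error of the mean difference, then the t statistic.
se_d_bar = s_d / sqrt(n)
t_stat = (d_bar - mu_d_null) / se_d_bar
degrees_of_freedom = n - 1

print(f"se(d_bar) = {se_d_bar:.4f}")
print(f"t = {t_stat:.3f} on {degrees_of_freedom} degrees of freedom")
```

The resulting value is then located on the horizontal axis of the CDF in Figure 3 to read off the corresponding cumulative probability.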
Formulae are given on page 14

Please remember: This examination question paper MUST BE HANDED IN. Failure
to do so may result in the cancellation of all marks for this examination. Writing your
name and number on the front will help us confirm that your paper has been returned.
Formulae
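$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}, \qquad P(A \mid B) = P(A) \text{ if } A \text{ and } B \text{ are independent}$$

$$P(A \text{ and } B) = P(A)\,P(B) \text{ if } A \text{ and } B \text{ are independent}$$

$$z = \frac{x - \mu}{\sigma}, \qquad z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

$$\mathrm{se}(\bar{x}) = \frac{s}{\sqrt{n}}$$

$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}, \qquad \mathrm{se}(\bar{d}) = \frac{s_d}{\sqrt{n}}$$

$$\bar{x} \pm t^* \times \mathrm{se}(\bar{x}), \qquad \mathrm{se}(\bar{x}) = \frac{s}{\sqrt{n}}, \qquad \mathrm{se}(\bar{x}_i) = \sqrt{\frac{\mathrm{MSE}}{n_i}}$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\mathrm{se}(\bar{x}_1 - \bar{x}_2)}, \qquad \mathrm{se}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \times \mathrm{se}(\bar{x}_1 - \bar{x}_2), \qquad \mathrm{se}(\bar{x}_1 - \bar{x}_2) = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}, \qquad \text{margin of error} = z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \times \mathrm{se}(\hat{p}_1 - \hat{p}_2), \qquad \mathrm{se}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = n p_i$$

$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad E_{ij} = \frac{R_i \times C_j}{n}$$

$$t = \frac{b_j}{\mathrm{se}(b_j)}$$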